1 TTGCTGACTC ATGTGCCCGC AGCTAGCAGG AGCTGGCAGC ATGGGCTCCC 
51 CAGGGGCTAC GACAGGCTGG GGGCTTCTGG ATTATAAGAC GGAGAAGTGG 
101 GCTCTCCTCG CCAAAAAAGG CTACCAGGAG CGGGACCTGG AACCCCAGTT 
151 TTCCATCATC ACCAAACTCA AAGGGGTTTC CGTCACTCAG ATCAAGGAGC
201 TTGGAAACCG GCTGTGGGAT GTGGCCGACT TCGTGAAGCC ACCTCAGGGA 
251 GAGAACGTGT TCTTCTTGGT GACCAACTTC CTTGTGACGC CAGCCCAAGT 
301 TCAGGGCAGA TGCCCAGAGC ACCCGTCCGT CCCACTGGCT AACTGCTGGG 
351 TCGACGAGGA CTGCCCCGAA GGGGAGGGAG GCACACACAG CCACGGTGTA 
401 AAAACAGGCC AGTGTGTGGT GTTCAATGGG ACCCACAGGA CCTGTGAGAT 
451 CTGGAGTTGG TGCCCAGTGG AGAGTGGCGT TGTGCCCTCG AGGCCCCTGC 
501 TGGCCCAGGC CCAGAACTTC ACACTGTTCA TCAAAAACAC AGTCACCTTC 
551 AGCAAGTTCA ACTTCTCTAA GTCCAATGCC TTGGAGACCT GGGACCCCAC 
601 CTATTTTAAG CACTGCCGCT ATGAACCACA ATTCAGCCCC TACTGTCCCG 
651 TGTTCCGCAT TGGGGACCTC GTGGCCAAGG CTGGAGGGAC CTTCGAGGAC 
701 CTGGCGTTGC TGGGTGGCTC TGTAGGCATC AGAGTTCACT GGGATTGTGA 
751 CCTGGACACC GGGGACTCTG GCTGCTGGCC TCACTACTCC TTCCAGCTGC 
801 AGGAGAAGAG CTACAACTTC AGGACAGCCA CTCACTGGTG GGAGCAACCG
851 GGTGTGGAGG CCCGCACCCT GCTCAAGCTC TATGGAATCC GCTTCGACAT 
901 CCTCGTCACC GGGCAGGCAG GGAAGTTCGG GCTCATCCCC ACGGCCGTCA 
951 CACTGGGCAC CGGGGCAGCT TGGCTGGGCG TGGTCACCTT TTTCTGTGAC 
1001 CTGCTACTGC TGTATGTGGA TAGAGAAGCC CATTTCTACT GGAGGACAAA 
1051 GTATGAGGAG GCCAAGGCCC CGAAAGCAAC CGCCAACTCT GTGTGGAGGG 
1101 AGCTGGCCCT TGCATCCCAA GCCCGACTGG CCGAGTGCCT CAGACGGAGC 
1151 TCAGCACCTG CACCCACGGC CACTGCTGCT GGGAGTCAGA CACAGACACC 
1201 AGGATGGCCC TGTCCAAGTT CTGACACCCA CTTGCCAACC CATTCCGGGA 
1251 GCCTGTAGCC GTTCCCTGCT GGTTGAGAGT TGGGGGCTGG GAAGGGCGGG 
1301 GCCCTGCCTG GGGATCTCAA GGATGAGGCC CCAGCATGGA GGATTGGGGG
1351 TAGAATTCCA CCCTTGAACC CCAGCAGACA GTCCCTCCCC TGACTCCCAC 
1401 CTTGGTAGGG TGCTGCCTCA GGGAGCCATA GAAGTCGGCT GTGTTTTGAG
1451 ACGGCGACAG AACCTGACCC GTGGAGACTG GGAGAGCCCA GCAGGCACCT 
1501 GTATTGCAGG GCTCCGACTG CATGTGGCAG GGGCTCCTGC TGCGTCTGGG 
1551 CCTGGAGGTC TCTCTCCCAG TGCTCTGTCC CCAGTGTTCC TAGCAGAGGT 
1601 ATGCTTACCA GCTGTCAGCA CAGACCCTCC TGCTGCCTGG GTCCTGGCCC 
1651 TCCTCCCCCA TCTGCACCCC CATCATAGGT AGAGACCCCA CCCTCCCATC 
1701 GGTCCTACAT GGGGCTGTGC AGCTGGAGCC AAAAAGGCAA GGCAGAAAGA 
1751 GGAGTGATGG GGGAGGGGGA TTGTTTCAGC TTCTCTGGTG CTGTGATGCC 
1801 CCAGGAGAGT CCTAATCTAG GGAATGGGGT GGAGTAGGCA GATAATCCAC
1851 CTCCCTATCC CCCAGGCAAG GGCGGAGCAT GTGTCTTGGG CCCACACCTG 
1901 CTTAGTTTAT GAGGACCGGC TGCTTTCCAG TGGTAGCCCT TTTGCCATGG 
1951 AGGTCTGGGA GAGAGAGCAG AGGGCGGCAG GGCTAAGTTG GTGATCATTG 
2001 GGTTCTTCAG GACCTTCTAT ATCCCTCCTC GGTAACCCCC CAGCCCAACC 
2051 CCTTGGAATC TTTCCTCCAG GCTTCCTGAG AGCCCTGGGG GTGGGAGGCT 
2101 GTGGGAGGCT GTACATCTGA AATTCACTTC AGTCCAAGTC ATACCTAGGA 
2151 AGCTGTCTGG GCAGCTGCTC GAGGGAGGCC CTGGCTCTGA TCCCAGGCTG 
2201 GATGGAGTGG CTGGAAGGAA TGGTTCCAAA CAACACCACC GAGATCTCCC 
2251 TCAGGCTGGC CAGGTTTTGC AGCTGGAATT CTCCTCTTGG TCCCAGGGCG 
2301 GGGCAGGGAA TTCTAAGTGT CCACCCCAGG GAGGCAAGGG GCTGCTTTCC 
2351 ACTGTGGGTA CCTGGTGATC AGGGCAAGCT GTGGAGGGCC AGGGGTGGGG 
2401 CTGAGACTGG GCTGACATCT AGAATCACCT GCCACCTGGA GCCTCAGTAA 
2451 AATGCCTGGG GTCCCTGCTG CCTCTCAATC TCCAGAGCCA TGTCCATGGG 
2501 GAGGTGGGCT CTGAAGGGCG AAGGTGGGAG AGCAGGGCCC CTGAGGCCTG 
2551 GGTATCCAAG GAGGGGCACG TGCACCTGAT TCTCCTTGGG GCCCAGAGGA 
2601 AGCTGATGTC ATGGCTGGAC AAAGTCACGG AGTAAAGCCA GCAAAGCCAC 
2651 CAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAA 
(SEQ ID NO: 1) 

FEATURES:

5'UTR: 1-40

Start Codon: 41 

Stop Codon: 1256 

3'UTR: 1259 
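
The feature coordinates above can be cross-checked against the sequence itself. The following is a minimal Python sketch, assuming the coordinates are 1-based and that "Stop Codon: 1256" gives the first base of the stop codon. The names `cds_head`, `CODON_TABLE`, and `translate` are illustrative, and the codon table is deliberately partial: it covers only the codons found in the first 30 coding bases (positions 41-70 of SEQ ID NO: 1).

```python
# Feature coordinates as listed above (assumed 1-based).
START_CODON = 41    # first base of the ATG
STOP_CODON = 1256   # first base of the stop codon (assumption)

# Coding bases run from the start codon up to, but not including, the stop codon.
cds_length = STOP_CODON - START_CODON
codons = cds_length // 3

# Positions 41-70 of SEQ ID NO: 1, copied from the listing above.
cds_head = "ATGGGCTCCCCAGGGGCTACGACAGGCTGG"

# Partial codon table: only the codons that occur in cds_head.
CODON_TABLE = {
    "ATG": "M", "GGC": "G", "TCC": "S", "CCA": "P", "GGG": "G",
    "GCT": "A", "ACG": "T", "ACA": "T", "TGG": "W",
}


def translate(dna):
    """Translate a DNA string codon by codon using the partial table."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


print(cds_length, codons, translate(cds_head))
```

Under these assumptions the coding region spans a whole number of codons, and the first ten translated residues can be compared with the start of the protein sequence (SEQ ID NO: 2).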
HOMOLOGOUS PROTEIN:

Top 10 BLAST Hits: 

Sequences producing significant alignments: 

                                                                     Score     E
                                                                     (bits)    Value
18000005098398 /altid=gi|4885535 /def=ref|NP_005437.1| puri.          857      0.0
335001098681202 /altid=gi|11417813 /def=ref|XP_009854.1| pu.          857      0.0
1000682348238 /altid=gi|6469324 /def=gb|AAF13303.1|AF065385.          855      0.0
18000005129684 /altid=gi|6754966 /def=ref|NP_035158.1| puri.          621      e-177
18000005027891 /altid=gi|6981322 /def=ref|NP_036853.1| p2X6.          604      e-172
148000001425983 /altid=gi|7920253 /def=gb|AAF70599.1|AF2050.          360      2e-98
18000005038217 /altid=gi|7447773 /def=pir||S71344 purinergi.          348      8e-95
18000005027890 /altid=gi|1709522 /def=sp|P51578|P2X5_RAT P2.          345      7e-94
18000005064403 /altid=gi|4505549 /def=ref|NP_002551.1| puri.          318      9e-86
18000005196095 /altid=gi|4099121 /def=gb|AAD00553.1|U8399.            318      9e-86




EST: 

Sequences producing significant alignments:        Score     E
                                                   (bits)    Value
gi|11617343 /dataset=dbest /taxon=96...             1164     0.0
gi|6992441 /dataset=dbest /taxon=960...              648     0.0
gi|4990980 /dataset=dbest /taxon=9606                579     e-163
gi|10325489 /dataset=dbest /taxon=96...              464     e-128
gi|2195075 /dataset=dbest /taxon=9606                287     4e-75



EXPRESSION INFORMATION FOR MODULATORY USE: 

gi|11617343 Brain - anaplastic oligodendroglioma
gi|6992441 Chronic lymphocytic leukemia
gi|4990980 Lung - carcinoid
gi|10325489 Lung - large cell carcinoma
gi|2195075 Colon



Tissue expression: 
Whole brain 
1 MGSPGATTGW GLLDYKTEKW ALLAKKGYQE RDLEPQFSII TKLKGVSVTQ
51 IKELGNRLWD VADFVKPPQG ENVFFLVTNF LVTPAQVQGR CPEHPSVPLA
101 NCWVDEDCPE GEGGTHSHGV KTGQCVVFNG THRTCEIWSW CPVESGVVPS
151 RPLLAQAQNF TLFIKNTVTF SKFNFSKSNA LETWDPTYFK HCRYEPQFSP
201 YCPVFRIGDL VAKAGGTFED LALLGGSVGI RVHWDCDLDT GDSGCWPHYS
251 FQLQEKSYNF RTATHWWEQP GVEARTLLKL YGIRFDILVT GQAGKFGLIP
301 TAVTLGTGAA WLGVVTFFCD LLLLYVDREA HFYWRTKYEE AKAPKATANS
351 VWRELALASQ ARLAECLRRS SAPAPTATAA GSQTQTPGWP CPSSDTHLPT
401 HSGSL (SEQ ID NO: 2)

FEATURES:

Functional domains and key regions:

[1] PDOC00001 PS00001 ASN_GLYCOSYLATION
N-glycosylation site 

Number of matches: 3 

1 129-132 NGTH 

2 159-162 NFTL 

3 174-177 NFSK 



[2] PDOC00004 PS00004 CAMP_PHOSPHO_SITE
cAMP- and cGMP-dependent protein kinase phosphorylation site

368-371 RRSS



[3] PDOC00005 PS00005 PKC_PHOSPHO_SITE
Protein kinase C phosphorylation site

Number of matches: 2

1 17-19 TEK 

2 131-133 THR 



[4] PDOC00006 PS00006 CK2_PHOSPHO_SITE
Casein kinase II phosphorylation site 

Number of matches: 2 

1 217-220 TFED 

2 336-339 TKYE 



[5] PDOC00008 PS00008 MYRISTYL 
N-myristoylation site 

Number of matches: 10

1 2-7 GSPGAT

2 5-10 GATTGW

3 45-50 GVSVTQ

4 113-118 GGTHSH

5 119-124 GVKTGQ

6 130-135 GTHRTC

7 146-151 GVVPSR

8 225-230 GGSVGI

9 297-302 GLIPTA

10 306-311 GTGAAW



[6] PDOC00932 PS01212 P2X_RECEPTOR 
ATP P2X receptors signature 

225-251 GGSVGIRVHWDCDLDTGDSGCWPHYSF 
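
The N-glycosylation matches listed under [1] can be reproduced with a simple pattern scan. The sketch below assumes the PROSITE PS00001 pattern N-{P}-[ST]-{P}, written here as a regular expression. The fragment holds residues 101-200 of SEQ ID NO: 2, with the V-V pairs at 126-127 and 147-148 read from the codons in SEQ ID NO: 1. The names `FRAGMENT`, `OFFSET`, and `find_sites` are illustrative only.

```python
import re

# Residues 101-200 of SEQ ID NO: 2 (V-V pairs taken from the cDNA codons).
FRAGMENT = (
    "NCWVDEDCPEGEGGTHSHGVKTGQCVVFNGTHRTCEIWSWCPVESGVVPS"
    "RPLLAQAQNFTLFIKNTVTFSKFNFSKSNALETWDPTYFKHCRYEPQFSP"
)
OFFSET = 101  # 1-based protein position of FRAGMENT[0]

# PS00001 (assumed): N, any residue but P, S or T, any residue but P.
# The lookahead lets overlapping candidates be reported as well.
N_GLYC = re.compile(r"(?=(N[^P][ST][^P]))")


def find_sites(fragment, offset):
    """Return (start, end, motif) tuples in 1-based protein coordinates."""
    sites = []
    for match in N_GLYC.finditer(fragment):
        start = offset + match.start()
        sites.append((start, start + 3, match.group(1)))
    return sites


for start, end, motif in find_sites(FRAGMENT, OFFSET):
    print(f"{start}-{end} {motif}")
```

Scanning only a 100-residue window keeps the example short; the same function could be applied to the full protein with an offset of 1.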

Membrane spanning structure and domains:
Helix   Begin   End   Score   Certainty

1       69      89    0.782   Putative

2       299     319   1.835   Certain



BLAST Alignment to Top Hit: 

>CRA|18000005098398 /altid=gi|4885535 /def=ref|NP_005437.1|
purinergic receptor P2X-like 1, orphan receptor; P2X
specifically expressed in skeletal muscle; purinoceptor
P2X6 [Homo sapiens] /org=Homo sapiens /taxon=9606
/dataset=nraa /length=431
Length = 431
Score = 857 bits (2189), Expect = 0.0

Identities = 405/431 (93%), Positives = 405/431 (93%), Gaps = 26/431 (6%)



Query: 1   MGSPGATTGWGLLDYKTEK--------------------------WALLAKKGYQERDLE 34
           MGSPGATTGWGLLDYKTEK                          WALLAKKGYQERDLE
Sbjct: 1   MGSPGATTGWGLLDYKTEKYVMTRNWRVGALQRLLQFGIVVYVVGWALLAKKGYQERDLE 60

Query: 35  PQFSIITKLKGVSVTQIKELGNRLWDVADFVKPPQGENVFFLVTNFLVTPAQVQGRCPEH 94
           PQFSIITKLKGVSVTQIKELGNRLWDVADFVKPPQGENVFFLVTNFLVTPAQVQGRCPEH
Sbjct: 61  PQFSIITKLKGVSVTQIKELGNRLWDVADFVKPPQGENVFFLVTNFLVTPAQVQGRCPEH 120

Query: 95  PSVPLANCWVDEDCPEGEGGTHSHGVKTGQCVVFNGTHRTCEIWSWCPVESGVVPSRPLL 154
           PSVPLANCWVDEDCPEGEGGTHSHGVKTGQCVVFNGTHRTCEIWSWCPVESGVVPSRPLL
Sbjct: 121 PSVPLANCWVDEDCPEGEGGTHSHGVKTGQCVVFNGTHRTCEIWSWCPVESGVVPSRPLL 180

Query: 155 AQAQNFTLFIKNTVTFSKFNFSKSNALETWDPTYFKHCRYEPQFSPYCPVFRIGDLVAKA 214
           AQAQNFTLFIKNTVTFSKFNFSKSNALETWDPTYFKHCRYEPQFSPYCPVFRIGDLVAKA
Sbjct: 181 AQAQNFTLFIKNTVTFSKFNFSKSNALETWDPTYFKHCRYEPQFSPYCPVFRIGDLVAKA 240

Query: 215 GGTFEDLALLGGSVGIRVHWDCDLDTGDSGCWPHYSFQLQEKSYNFRTATHWWEQPGVEA 274
           GGTFEDLALLGGSVGIRVHWDCDLDTGDSGCWPHYSFQLQEKSYNFRTATHWWEQPGVEA
Sbjct: 241 GGTFEDLALLGGSVGIRVHWDCDLDTGDSGCWPHYSFQLQEKSYNFRTATHWWEQPGVEA 300

Query: 275 RTLLKLYGIRFDILVTGQAGKFGLIPTAVTLGTGAAWLGVVTFFCDLLLLYVDREAHFYW 334
           RTLLKLYGIRFDILVTGQAGKFGLIPTAVTLGTGAAWLGVVTFFCDLLLLYVDREAHFYW
Sbjct: 301 RTLLKLYGIRFDILVTGQAGKFGLIPTAVTLGTGAAWLGVVTFFCDLLLLYVDREAHFYW 360

Query: 335 RTKYEEAKAPKATANSVWRELALASQARLAECLRRSSAPAPTATAAGSQTQTPGWPCPSS 394
           RTKYEEAKAPKATANSVWRELALASQARLAECLRRSSAPAPTATAAGSQTQTPGWPCPSS
Sbjct: 361 RTKYEEAKAPKATANSVWRELALASQARLAECLRRSSAPAPTATAAGSQTQTPGWPCPSS 420

Query: 395 DTHLPTHSGSL 405
           DTHLPTHSGSL
Sbjct: 421 DTHLPTHSGSL 431 (SEQ ID NO: 4)
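
The summary statistics reported for this alignment are mutually consistent, which the following short Python sketch checks. It assumes the percentages in the report are truncated to whole numbers rather than rounded; the helper name `blast_percent` is hypothetical.

```python
def blast_percent(numerator, denominator):
    """Whole-number percentage, truncated (assumed to match the report)."""
    return 100 * numerator // denominator


# Values taken from the alignment header above.
identities = 405      # identical aligned residues
gaps = 26             # gap positions in the query
aligned_length = 431  # alignment length, equal to the subject length
query_length = 405    # length of SEQ ID NO: 2

print(blast_percent(identities, aligned_length))  # identities, percent
print(blast_percent(gaps, aligned_length))        # gaps, percent
print(query_length + gaps == aligned_length)      # query plus gaps spans the subject
```

Read this way, every query residue is identical to its aligned subject residue, and the only difference between the two proteins is the single 26-residue stretch present in the subject but absent from the query.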





Hmmer search results (Pfam):

Scores for sequence family classification (score includes all domains):

Model     Description                                    Score    E-value    N
CE00369   E00369 P2X6_receptor                           1180.5   0          2
PF00864   ATP P2X receptor                               870.0    7.4e-258   1
CE00207   CE00207 PURINERGIC                             366.8    5.9e-111   1
CE00370   E00370 P2X4_receptor                           336.8    1.9e-109   1
CE00368   E00368 P2X7_receptor                           124.1    6.5e-36    1
PF00095   WAP-type (Whey Acidic Protein) 'four-disulfi   8.7      1.1        1
PF01841   Transglutaminase-like superfamily              6.0      6.3        1
PF01368   DHH family                                     2.5      6.8        1

Parsed for domains:

Model     Domain   seq-f   seq-t      hmm-f   hmm-t      score    E-value
CE00369   1/2      1       19   [.    1       21   [.    36.3     2.1e-11
PF00095   1/1      87      111  ..    1       40   [.    8.7      1.1
PF01841   1/1      120     130  ..    1       11         6.0      6.3
PF01368   1/1      221     237  ..    1       19   [.    2.5      6.8
CE00368   1/1      54      299  ..    85      333        124.1    6.5e-36
CE00370   1/1      20      338  ..    46      372        336.8    1.9e-109
CE00207   1/1      20      345  ..    47      393        366.8    5.9e-111
CE00369   2/2      20      351  ..    48      379  .]    1143.5   0
PF00864   1/1      20      354  ..    34      395        870.0    7.4e-258
1 TCTCCAAGTC CATGGGTGCC TGGTAGGAGA CAGGGGGATG AATGTGAACC
51 CCTGCATGGC TATAGCCACC TGCCTCCTCC CCTGCGCTGC ATCACTACCT 
101 GGCCTATTTT TTGCCTCTAG AAGCACTGCT TCCTATGCTC CTTAGGACCA 
151 CTGCCCGCAT ATGACAGATA AGAACATCGA GGCTAAGGCA ACGCAAATCT 
201 TTTCCTTAAA GTCATACAGC TGTCAAAAGA AAGCTGGACA ACCTGGGCAA 
251 CATAGCGAGA TAAAAAATTA TTTAAATTAG CCAGATGTGG TAGCCCCCTG 
301 TAGTCTCAGC GACTCAGGAG GCTGAGGCAG GAGGCTCACC AGAGTGCAGA 
351 GTTCAAGGAT GCAGTGAGCT ATGATCCTGC CACTGCACTG AAAGCTGGGT 
401 GACAGAGCAA GACCCTGGCT CTAATAAATG AATACATAAA GTCTCACAGG 
451 TAGTGGTAGC TAATCCTGCC AGAGTCAGGC CTCTACCTGT CTGATGACAA 
501 ATGGCACACT ATGTCTTTTA ACCTGATTGC AGACCACAAA TGTTTTGTGA 
551 ATATTTTCCC CAGGGAAAAA ACCGGAAGTA GTTCTAAATT CTATACATCC 
601 ATTATATTAG TTTTACCTGT GGATTGGGAA AACCCAGCTC TGATTGCATT 
651 TCAGGGCGGG ACAGCCTTTG GTGCACTGTC TGGCGGGATT TTCCATTTTA 
701 ACCTCCTTCT AGAAGCGCCT TCTCATGGTA AAGTTCCTGA TGCCGCCAGG 
751 AGCGCCGAGG AGAGGGCAGG GGGCTGGAGA CGCCCCGCAG AGGGCTACGT 
801 GCCCTGCTGG ACAGAGGTCT CCTGCCTCCT CGGCGGCGCC AGCCCACCTC 
851 CCACAACCCC TGCGGGAGAA GCCCCCAAGG GGAGGAGACG GGCCTGGCCC 
901 CTGCCCCGAG CACCTTCCGT CTCTAGGTCG GAGTCTGAAT CGGCCTTGGG 
951 ACCCTGCTTG GCTTCGGGGA CCCCTGCAAG ACGTCCACAG GCCGCCGTCG 
1001 CCTCTTCCTC CTGCTTTTTA TCCTCCCCAG ACCTCTGGCA GGAACCGCTC 
1051 ATCGTTACGC CCCTTTCGCA GCCTCAGACC CTGAGGCGGA GACCGCTTGG 
1101 CGCCTCACTT AGAGCGCGAC CCGGGGATGT GGGCGGAGTC TGCGGCTGCG 
1151 CTGACCAATC GAGTGTGGCG TCCATCGACT GGCGTCTGCC ACGGCAATTA 
1201 GCGACGCGCT CCCCCGCGGC GGTCGCCCCG GCAACCCAGT GCTGTAGGTT 
1251 GCCGTAGAAA CCGTGGCTCT CCTGCGCTGA GGCTCCTCGC CTGAGAGGAT 
1301 AAACTGCACG CGCCACGGGC TATGCACTGG GCTGGGCGCC TTGTGGGCAT 
1351 CCTCCCTGCC TTCCTAGGGG GTTCCAGCAT CGCCCCCCTT TCGTGGACTG 
1401 GGAAACACGC CTGACTCCAG GACTTGTGTT GTCCTCACTG CACTGGGGAA 
1451 GGTGGCGGGG GCAGCTTTTC AGGAGGGCCT GGGGAACTTC GCAGAGCCAG 
1501 GTCACCCTCT CACTCTGTGC CTCTTAGTTA TCTTGCATGC TCTGGTCTTT 
1551 GCATACGCTG CTCCCTGCAC CAGGAACCTC CATCCCCATC TTTGTCTGCT 
1601 TGTCGAACTT CAGAAATCTG CAAGGGTCAG CTTAGAGGTC ACTTCTTCCG 
1651 GAAGCTTTCC TCAACACCCT CCCCGCCCTG CTGCTGCTGC CCTCAGGCCC 
1701 TCCTCTCACA GCACTGATAA CAGCTGTCCG TCTCCACCCT CCCACCACCT 
1751 CCACTCCCAC CCCAGGAAGT GAGGCCAGAG GGCAGGGACA GAGCTGCTGC 
1801 TGTTCTCTGT GTGCCAGGGC CCAGCAAAGG GAATGTAGGG AGGGTGGGAG 
1851 GTGCAGGGCA GCTGGGATTA GGGGTTGAGG GCTGGGTGTT GGAGGCTGGA 
1901 TCTGGATCCT GCTTTAGTGG AAGTGTCCCT TTAACAGCAA CTGGCCTGGC 
1951 CTGGCTCGGG CCCTGCTTTG CCTCCTGTTC AGCTGCGGCT GCAGCTGCCA 
2001 TGCTGACTCA TGTGCCCGCA GCTAGCAGGA GCTGGCAGCA TGGGCTCCCC
2051 AGGGGCTACG ACAGGCTGGG GGCTTCTGGA TTATAAGACG GAGAAGTATG 
2101 TGATGACCAG GAACTGGCGG GTGGGCGCCC TGCAGAGGCT GCTGCAGTTT 
2151 GGGATCGTGG TCTATGTGGT AGGGTAAGAG AGAAGAGCTT TTGGCCAGGC 
2201 TGGAGGGGCA AGGGAAGAGG TGGGGGGTGG GGCTTGGTCC TGCTGGGTTG 
2251 AAGTTGAGGG TTGGGCTGTT TAGGGGCTGG AGTGGAAGGG GGCAGATTGG 
2301 GACGGGGTTG GGGAGAGCTA GGCGATACAA GACAGGAGAG CAAGAACAAG
2351 CTGTGTGTTT GTCCTGTGTG TCCACTTGCC TCCTTCCCAG GGCCCCACCC
2401 AGGCCCCACC CAGGGGGCAC ATGACATAGT CCTTAACATC TGTGAGAGCT
2451 GGAGCACTAG GCCCCCAGAG AGACCACCAG CTGTATCTCG GGTCAGGAGA
2501 GTCTGTAAGG GGGAAGCTGG ATCTAGTCAG GCTGGGGGTG GGTGCTGGCT 
2551 AGTGAAGGTG ATTGTCTGAG GGCATTGGCT CTCTGATGCA TGGCTGGAGC 
2601 TTCTGTCTCA TTCAGGGGGT CTGGAGTGGG AAGTGGGGCC AGAGAGGAGG 
2651 TGGGGCCTTC GATGTTGGGC CGGGAGCCTG TAGGGTGTGG GGGGAGAACT 
2701 GAGCATGTAG GGCTCAGCTC CGCCCCTGTC ACTACACGCT GGGGACACAC 
2751 CACACTGCCC GACTTCTCCT CCCCAGGTGG GCTCTCCTCG CCAAAAAAGG 
2801 CTACCAGGAG CGGGACCTGG AACCCCAGTT TTCCATCATC ACCAAACTCA 
2851 AAGGGGTTTC CGTCACTCAG ATCAAGGAGC TTGGAAACCG GCTGTGGGAT
2901 GTGGCCGACT TCGTGAAGCC ACCTCAGGTG GGGGCCCTGA TGTTGCTGAC 
2951 GGGGGCGCAA GTCCTTTCCC CACTGACAGC CTGAACACCC GCCATGCAGC 
3001 CAGTGTGTGC GAGAGAGAAG CATGTGATGC CAGAGACGGC TGCGGGTTCT 
3051 CAGGAAGGGC TTCACAGAGG AGTGGCACCT GGACAGGACT TTCAGGGATG 
3101 TGTAGGAGGT TTTGGGGTGG AAAAAGGGGC CACTCAAGAA GCCAGGCCAG 
3151 GGTTGGACGT GCTGGCTCAC GCCTGTAATC CCAGCACTTT GGGAGGCCGA 
3201 GGCAGGTGGA TCACGAGATT GAGAGTATCC TGGCTAACAC GGTGAAACCC 
3251 CATCTCTATT AAAAATACAA AAAATTAGCC GGGCATGGTG GTGGGCGCCT 
3301 GTAGTCCCAG CTACTCGGGA GGCTGGGGCA GGAGAATGGC ATGAACCCGG 
3351 GAGGTGGAGC TTGCAGTGAG CCGAGATTGC ACCACTGCAC TCCAGCCTGG 
3401 GTGGCAAAGC GAGACTCTGT CTCAAAAAAA AAAAAAAAAA GCCAGGCCAG
3451 AGAAACTGCA TTTCCAAAGA CTGCCAACAG AAAAGAAGGG AGTGTCCAGG 
3501 ACTAATGGCT TGAGCTTGAG AGTGGTGTGA GGTGCTGGGG CATGGAACTT 
3551 CCCTGTAGCC CTGCTCCCTG ACCTGGGGCA CTACGGTCAG GTGCTGCTCC
3601 TCCCCTCTTC TCGGCTGCGT TTTCCTCCTC CCTCCACCCA GCTCATCCCC 
3651 AGCCTCAACT GCCACTTCTG CTCCTCTGAT GCCCAGGGTG TATTCCCAGT 
3701 GATCACCTGC CCAGAGCACA GCTGTCTTCT AGGTGCACAC CCACATGTCC 
3751 AAAGATCAAT TATTTTCCTC TCCTGGCATG GCCTCTGTGA CGCCCACTAG 
3801 TCATGGTGGC TGTGACATCC ACTAGTGCCT CAGCCAGACC CGTGACTCAC 
3851 CCTGGACCCC TTCCTGTCCC TTCCAAGATT TTTCACCACT ACCCATGCCA 
3901 TGCCATGCAT GAGACTATGG CCTCCTAGAG GGTCCCTAGA TGCCCCTCTC 
3951 GCCTCCTCCC TTACTGCTCG GTGCACACCA CGCAGCAGCC AAGCTGAACT 
4001 TTCACACCAG GCATCATGAG AGCCTGCAGC GCCTGCTTCT ACCCTCAGGA
4051 ATTCCCCCAA CCCTGCCCAT GACGGTGTCC ACACTTTCCT CCCAATCCTA 
4101 ATGGCTGCCA CTCCCAGCAC CATCTGGCCA GCCCTCACCT TCCCTTCCTG 
4151 GGCATACATT CCCCAAATTC ACAGTGCTCT CACGAGCAGC ACTGGAGGGT 
4201 CAGCCTTTCT TTCCAATGTC CTCGGCCACC CGTTGACCAC AGACACAGCT 
4251 TTCCCTCTTC TCCCTTGGCC CCTGCCATGC CAGTGCTGCT GTGTGTGAGA 
4301 TGGGAGACTC ACCTCGTCTC CATCCTGAGC AGGTGCTGGG CCCAGCTCTC 
4351 CCTTGGATCT TCAGTACTAG AAGCAGCAGG CTGTTGGAAT ATTCTGGTTG 
4401 GAGCCAGGCA TGGTAGCTGG AGCCTGTAGT CCCAGCTACT TGGGAGGCTG 
4451 AGGCAGGAGG ACCTCTTGAG TCCAGGAGTT AGAGGTTGCA GTGAGCACTG 
4501 ATCACAACAC TACACTCCAG CCTGGGTGAC GAAGTGTAAT CCTGTCTCTA 
4551 AATACACACA TACACATGCA CACACACACA CAAATTTTGG TTGAGACAAG 
4601 AGACTTGTCT CAAGAGATGG ACATGGGCAC AAGGCTTCCT GGTCTCAAAA 
4651 ATGGCCAGAA CCACTGCCAG CCTCCCATCT CTGCTTCAGT CTGCCTTACA 
4701 GGGGGACAGG GTTAATGACT TGATGGGGCC AACATCCCTT CCCTCATAAA 
4751 CCAGGCTGCC GGCTTCCGGC CTTTCCAGTC AACACGAGCC CAGCCAGGCC 
4801 AACCTTGAGA CTTGCCTCCT AGGGAGAGAA CGTGTTCTTC TTGGTGACCA 
4851 ACTTCCTTGT GACGCCAGCC CAAGTTCAGG GCAGATGCCC AGAGGTGAGT 
4901 TTACCCAGGA TCCTCCCAGC GGGTCCCTTG TTCCTCCATC AGCCCCAGGT 
4951 GGCCACCCGT GTTTCCCTTT CCCCTTCCCA GGTGGCTGAA GGCTCAGCCT
5001 GTGCTCGGTG TCCCCCAGGC ACTGGGCTAC ATCTTTTCCT GAATCATTAT 
5051 GTTCAGTCTT CACATATCCC CTGCCTGGTA GGAAGTCCTG TGATCCCCAT 
5101 TTCAGAGGAG AAGACTGAGG CTCAGTGAGG TTGAGTCACT TTCTTAAGGC 
5151 CTCCAGGCCT GTGGGTGACA GGACCCCGAG CTCTGGGCAG CAGCAGTTCC 
5201 CATGAGGTGT CCAGGCCCTC CCATCCTGGT CCTGCCTCTG GGTACTCTCC 
5251 AGGTTGGTAG TGTGACACCC AGAGCTGCGC ACATGCTCAG GGAGGTTCTA 
5301 ATAGCAAGAG CCAAGCTGGA ATATCACCTC CCCTTGTCTG TGCCCAGCCT
5351 CTATTAATAT GTCCTGAGGC AGCTTTCATC TTTGTGGGCC AACACAGCAC
5401 ACTCTTGCTC ATGGTGAATT CAGGATTGCT TATGATTTCT GGATAGTTTT
5451 TTTTGTTTTA TTTTTGAGAC GGAGTTTCAC TCTGTCACCC ACGCTGGAGT
5501 GCAGTGGCAG ATATCAGCTC ACTGCAAGCT CTGCCTCTCA GGTTCACGCC 
5551 ATTCTCCTGC CTCAGCCTCC GGAGTAGCTG GTACTACAGG CGCCTGCCAC 
5601 CACGCCCAGC TAATTTTTTT TTTTTTTTGT ATTTTTAGTG GAGACGGGGT
5651 TTCACGGTGT TAGCCAGGAT GGTCTCCATC TCCTGACCTC ATGATCCACC 
5701 TGCCTCGGCC TCCCAAAGTG CTGGGATTAC AGGCGTGAGC CACCACGCCC 
5751 GGCCTGATTT CTGGATAGTT TTTACATCAA CCGTGGTCAA GCCAGAGTCC 
5801 CCCACCTTGT TCTTCTTCAT TTCTGATCCA GAAATGCTGA TTCTCCCCCT 
5851 GACATTTCAC CTTTTCCCCT TGCCTGGGGA TGTCCCTGGG ATCCTGCATC 
5901 TGTCACAGAG CATGCTCATT CTCTCCAGCT GTGAATTTTG TTTGAACTAT 
5951 TGGGACTCAG GACATAGTCC TGAAAGTTTA CCTCCACAGT GACATCTTTA 
6001 GGCAAGTCCA ACATTTACGT GCCTCCTGGG CTGGAGGGTC GTTGTGCAGA 
6051 CAGCTGTCCC CTGAGCCCTG GTGGCTGGTC CTAGCACAGT TGCTGGAGAC 
6101 ATCCCATGTC CGTAGTTGGA AATATGCACA AAGGATTGCT TACTCTTTTT 
6151 GTTTGTTTGT TTTTTTGAGA TGGAGTCTTG CTCTTGTCCC CAAGGCTGGA 
6201 GTTCAATGGC ACGATCTCGG CTCACTGCAA CCTCCGCCTC CTGGGTTCAA
6251 GCAGTTCTCC TGCTCACCCC CTGAGTAGCT GGGATTACAG GTGCCCGCCA 
6301 CTGTGCCCAG CTAATTTTTG TATTTTAAGT AGAGACGGGG TTTCACCATG 
6351 TTGGCCAGGC TGGTCTCGAA CTCCTGGCCT CAGGTGACCC ACCAGCCTCG 
6401 GCCTCTCAAA GTGCTGGGAT TACAGGCGTG AGCCTGCCGA GAGCTTGGTC 
6451 GGGGAGACCT GAACCCAGCG GTGCTAAAGG AATTAAAGAC AAACACACAT
6501 AAATATAGAG GTGTGGAGTG GGAAATCAGG GGTATCACAG CCTTCAGAGC 
6551 TGACAGCCTC GAACAGATTT ACCCACATAT TTATTGACAG CAAGCCAGTG 
6601 ATAAGCATTG TTTCTACCAG ATTATAGATT AACTAAAAGT ATTCCTTATG 
6651 GGAAACAAAG GGATGGGCTC TGGTTGGTTA TCTGCAGCAG GAGCATGTCC 
6701 TTAAATCACA GATCGCTCAT GCTATTGTTT GTGGTTTAAG AACGCCTTTA 
6751 AGCGGTTTTC CGCCCTGGGT GGGCCAGGTT TTCCTTGCCC TCATTCCGGT 
6801 AAACCCACAA ACTTCCAGTG TGGGTGTCGT GGCTATCACA AACATGTCAC 
6851 AGTGCTGCAG AGATTTTGTT TATGGCCAGA TTTTGGGGGC CTCTTCCCAA 
6901 CATGAGCCAC TGTGCCTGGC AGGATGTGCT TACTCTTGGT GAACCCACAC 
6951 AATGTCCTTC TCTTTCTTAA TGCTCAGATG TGCATTTAGT GTTCAGTTTG
7001 TAGACCGTTC TGAAATTTGG CTGGATCTGT GGGTCTGTGT TTTTCAGAAT 
7051 CTGTGCAATT CCTCTTTGTC TGCAACCACA CTTCTGGCTC TTCCCATGAA 
7101 ACGTCAGGGC TGGGTCGTAA TTATCAGATC TGACAACCTG GCTTTCCCGG
7151 AAGACCAGAG TTCTGCCAGC TCCTCTAGGG ATCCTGGTGC CTGATCCCTC 
7201 CCTTACATGC ACCATGCTCT TTATAGTGTC ACCTCCCTCA GCAGACACCG 
7251 CTGAGCCTCC CCGCTGGGCC AGGGGGCTAG CTAGGCTAAA TTCACAAAAC 
7301 TCCATCTCCC ATACTTCAAA GACCACCCAC ATGGACAGCC CAGCCCAGGT 
7351 GGCAGGTCCG ATGATGGGAC AGAGGCTGTA GGTGGGGGAC CTAGGGCTGC 
7401 ACTTGAGCAG AATCTTTTTT TTTTTTTTCT TTTTTTTTTT TTTGAGACAG 
7451 AGTCTCGCTC TGTCACCCAG GCTGGAGTGC AGTGGCGTGA TCTCGGCTCA 
7501 CTGCACACCT CCACCTCCTT GGTTCAAGCG ATTCTCCTGC CTCAGCCTCC 
7551 CAAGTAGGTG GGAGTACAGG CACACACCAC CACACTCGGC TAATTTTTGT 
7601 ATTTTTAATA GAGACAGGGT TTTGCTGTGT CGGCCAGGCT GGTCTCGAAC 
7651 TCCTGACCTC AGGTAATCCG CCCACCTTGG CTTCTCAAAG TGTTGGGATT 
7701 ACAGGTGTGC CAGGCCAAGG AGAATCTTAA AAAAAGGTGG GGAGAAGCTG 
7751 GTGAGCAGGT GGATTTGGTT GAAGCAGGAT GTCGACACAG AGGGGGCTTG 
7801 GTGGGTAAAG GCCCTGAGCT GTGTGAGGTG AGGTGCCTTT AGGGCTACCT 
7851 GCCACTGGGT GGAGCTGAAG TGAAGATTTG GACTGGGGTG GGAAGAAGGT 
7901 AGTTCAGGAT TTCAGGGGCC CCTGTAAGCC CCACTAAGGA GCTAAACTGT 
7951 TTTTGTTTGT TTGTTTTCTT TTTCTCTTTT CTTTTTTTTC CTGTAGCAAT 
8001 GAGGTCTTGC TTTGTTGCCC AGGCTGGTCT CGAACTCCTG AGCTCAGGCA 
8051 ATCCGCCTAC TTTGGACTCT CAAAGTGCTA GGATTACAGG CGTGAGCCAC 
8101 TGTGCCTGGC AGGAGCTAAA CTTGATTAGA GGAACAGAAG AGAGCCACAC 
8151 GTGGGCTCAG AGGCAGGGTG CTCAGTTTCC TGCACATTGG GATGCACCAC 
8201 TTGGGCTGCT GGGCATAGGT GGATGAGGGT ATGGGAAGAC GTGGGGGCCC 
8251 CACTGGTGGT CACTGTGGGG TCTAGTTGGA GGAGACGGTA GCCCAGCTGG 
8301 GGTGAAGAGG AGAGGCAGAC ACAGGACATA GGTAGGGACA AAGAAGCAGA 
8351 GCATGTGGCT CTGCTCCGAC CTCCACCCAA TCACGACGGC CCTGTCTTTC 
8401 AGAAAGTCCC ACCGCCTCAT TCTGGCTTCT CAGAGGCCCT CAGCCTTCCT 
8451 TGCGCCCCTG GTGCTGGTGT TCTTCCTGCT GCCCCTGAGC TGAGTGCCCT 
8501 GGGCAGCAGT GTCCATCCTC AGTTGGGGCA GGACCATGCC TGGGAGAGTG 
8551 CCCGATGCTC AAGGGTGCCT TCGTCTCTGG GGTCTGGGAC CCCAGAAAGC 
8601 TCACCTGTCC TCCCCTTCTG CCAGAGCCCC ATAGTCCCAT GCCTCTGTGC 
8651 AGGCATTAAT GTCCCCAGGT TACAGAAGAG CGAGCAGGAA GGAGTAGCCT 
8701 GTGGTCCCTC AGCAAGGGTG TGGGGTCCTG CTTCAATACC CAAGCCCCTG 
8751 ACTCTAGGGC CCTGATCTTT GTCAGCTATG TCCCCATGCC GGGCATCAAA 
8801 AACTCACCCT CCCAAGGTAT CTTCACCTTC CCTGATCTGT CATCCAAATT 
8851 GGACCAGAGG AGCTAGACCT GGAAGAATCA CTTCCGCATC CACCAGGGAC 
8901 AGAACTGTCA GGAGGGAAGG GGCAGGGTGC GTTGTCTCAC GCCTGTAATC 
8951 CCAGCACTCT GGGAGGCTGA GACAGAAGGA TTGCTTGAGG CCAGGAGTTA 
9001 AAAACCAGCC TGGTCAACAT AGCAAGACTC CATCTCTACA AAAAAAAAAT 
9051 ATTAAAAAAT CAGCCAGGCA CAGTGGTGTG TGTCTGTAGT CCCAGCTACT 
9101 GGGAATACTG AGGTGAGAGG ATTGCTTAAG CCCGGGAGGG CGAGGCTGTA
9151 GTGAGCCATG ATCATACCAC TGCACTAGAG CCTGGACAAC AGAGTGAGAC 
9201 CGAATCACTA AAAATAAATT TTTTGAAAAA GGAGGAAAGG GGTCTCCCTT 
9251 TGTCTTTGAA ATACAGTACT GTACCTTCAT CTGGCCAGGG CATTGCTCCG 
9301 CTCCCTCCTC TGACCACCTC CTTTTATTTG CACCCTCCAG CTTTCCTGTG 
9351 TGGCCCCACA CTCAGGGTAC TCTGGCGGCG GGGTGGTGAG GTTGTTTAAG 
9401 GTGGGAAGGG GGCCTGTCCT TCCCACCTTG AACCTCCCTG CCTTTGAGAC 
9451 TGGGCTGTGG AGGGGAGACA TCCCCTGTGC CATTGGTGAC TGCTCTCTCT 
9501 CCCACCTCAG CACCCGTCCG TCCCACTGGC TAACTGCTGG GTCGACGAGG 
9551 ACTGCCCCGA AGGGGAGGGA GGCACACACA GCCACGGTAA CTGTGGGCTC 
9601 TGTCTTCCAG TGCCCCTAGC AGGGTGGGGG CCGGGCTGGG ATCCTGGGTG 
9651 GCTCCTGAGT GCAGGCCCTG CTCGCCTCTG TCCCTGCATC TCTCTTTCTG 
9701 CCAACAACCC CCTGGCTGAA GGCCTCCCCA GGCCTGCAGA GATTTGAAGG 
9751 TCTGGAGTTC ATCTTTTGTT TTCTAGGTGT AAAAACAGGC CAGTGTGTGG 
9801 TGTTCAATGG GACCCACAGG ACCTGTGAGA TCTGGAGTTG GTGCCCCGTG 
9851 GAGAGTGGCG TTGTGCCCTC GTAAGTGTCC CCACAATCCC CTACCCCAAC 
9901 TGGCGCAGGG CCCCAGGCCT GGCAGAGGCT GTCACCTCCC TTCCACCTGC 
9951 AGGAGGCCCC TGCTGGCCCA GGCCCAGAAC TTCACACTGT TCATCAAAAA 
10001 CACAGTCACC TTCAGCAAGT TCAACTTCTC TAAGTAAGCA GAGTGGGTCT 
10051 CATCTGCCCC AAGACCCTCC TTGTCCCCTA CCTCATCTGA CCTTTCCCAC 
10101 TCCTCCCAGG TCCAATGCCT TGGAGACCTG GGACCCCACC TATTTTAAGC 
10151 ACTGCCGCTA TGAACCACAA TTCAGCCCCT ACTGTCCCGT GTTCCGCATT 
10201 GGGGACCTCG TGGCCAAGGC TGGAGGGACC TTCGAGGACC TGGCGTTGCT 
10251 GGTGGGTCCC AAGTTGGGGG CAGGGTTCCT AGAGGGCTCT GGGAGAGGGT 
10301 CCCGGGCCCA CCCACCGGTG GAAAAGCTAT GTGCTATGTG CAGGGTGGCT 
10351 CTGTAGGCAT CAGAGTTCAC TGGGATTGTG ACCTGGACAC CGGGGACTCT 
10401 GGCTGCTGGC CTCACTACTC CTTCCAGCTG CAGGAGAAGA GCTACAACTT
10451 CAGGTGAGGC CCCACTGCTC CCAGTGCCCA GCTGCTGGGC CCATCGCCCT 
10501 CTCACTGTGG CGGCCAGGAC AGACCACACC CAGGCCCAGG CCTCTAGATA 
10551 TTCCACTACG TGTGCAAGGG GGTCCCAGGA GCAGGAGAGA GCTGTTCTCA 
10601 ACCCCACATC CTCCAGCACA GGCTCCGTCC TGCTGCCCCA AGTCCTGAGC 
10651 CCTCCACCCC ATCTGTCCCA GGCCCCTGCC CAGCTCAGGC TCCTCACTGC
10701 CAGCCCTTCC TCCACCCCAC CTCGCTTCTA GTATCTCCCC TCCACAGCAA 
10751 TGGGGTGTTT CATTTTTACT TTCCCCTTCT CCCCTTCAGC TTTGTTTTTT 
10801 TTTTTTTAAG ACAGAATCTC ATTCTGTCAC CCAGGCTGGA GTGCAGTGGC 
10851 CCGACCTCGG CTCACTGTAA CCTCTGCTTC CTGGGTTCAA CCGATTCTCC 
10901 TTCCTCAGCC TCCTGAGTAG CTGGAATTAC AGGTGCTCGC CACTACTCCC 
10951 AGCTAATTTT TATATTTTGG TAGATAGAGA TGGGTTTTCA CAATGTTGGC 
11001 CAGGCTGGTC TCAAACCCCT GACCTCAGGT GATCCACCCA CCTCAGCCTC 
11051 CCGAAGGGCT AGGATTACAG ACGTAAACCA CCATGTCTGG CCTCCCTTCC 
11101 GCTTTTACCT AAACTTTTTT TTTTTTTTTG AGATGGAGTC TCACTCTGTC 
11151 GCCCAGGCTG GAGTACAGTG GCGGGATCTC AGCTCACTGC AAGTTCCGCT 
11201 TCCCGTGTTC ACGCCATTCT CCTGCCTCAG CCTCCCAAGT AGCTGGGACT 
11251 ACGGGTGCAC GCCTCCACGC CCGGCTAATT TTTGCATTTT TAGTAGAGAC 
11301 AGGGTTTCAC CATGTTGGCC AGGATGGTCT CGATCTCTTG ACCTCGTGAT
11351 CCACCTGCCT CAGCCTCCCA TAGTGCTGGG ATTACAGGCG TGAGCCACCA 
11401 CGCCCGACCT TTTTTTTTGA AACGGAGTTT TCACTTTCTT GTAGTCCAGG
11451 CTGGAATGCA ATGGCGTGGT CTTGGCTCAC TGCAACCTCT GCCTCCTGGG
11501 TTCAGGTGAT TTTCCAGCCT CTGCCTCCAG AGTAGCTGGG ATGACAGGTG 
11551 TGCACCACCA CACCCAACTA ATTTTTGTAT TTTTAGTAGA GATGGTGTTT 
11601 TGCCATGTTG GCCAGGCTGG TCTCGAACTT CTGACCTCAG GTGATCTGCC
11651 CACTTCAGCC TCCCAAAGTG CTGGGATTAC AGGCATGAGC CACCAAGCCT 
11701 GTTTTTTTTG TGTTTTTTTT TTTTTTTTTT TTAGATGAAG TTTTGCTCTT 
11751 GTTGCCCAGA CTGGAGTGCA GTGGCCCGAT CTCGGCTCAC TGCAATCTTT 
11801 GCCTCTCGGG TTCAAGCAAT TCTCCTGCCT CAGCCTCCTG AGTAGCTGTG 
11851 ATTACAGGTG CACACCACCA CACCCAGCTA ATTTTTGTGT TTTTACTAGA 
11901 GATGGGGTTT CACCATATTG GTCAGGCTGG TCTCGAACTC CTGACCTCAG 
11951 GTGATCCACC TGCCTCAGCC TCCCAAAGTG CTGGGATTAC AGGTGTGAGC 
12001 CACTGTGCCT GGCCTCAAGT TTCATAAATT GCATTTATTA TCATGTCTTT 
12051 GAGTCTTCTA AGCAGATCTA TTGGATCCTT CTGCCACCGA GCGTCACCTC 
12101 GTCATGCAGG CAGGCACACA CGACCACCAG GCCTGGGGAT GATGCCCCTC 
12151 AACATAGCTC ACTGCACCCC GTCTGATCTG GCTTCCCCAA CCTCCCCAGC 
12201 CCTTCGAAAC CACGTGGGGC TGGCTCCCAC CCACATCCTG TTCCCCTGAC 
12251 CTCTGTGCTG GCAAACCACC TGTGTGCATG TTCCTTCAGG CCCAGCCTCA 
12301 TGTCCCCTCC AGGAAGTCTA CCCCAGTTCC CAGGGAAGAG TGAGTTCCCA 
12351 TCTCTGGAAT CCCTCAGCCC TGAGCCTGCC CCTTCACATC CCCCGCTGCT 
12401 GGGTCTGTTT AGGGACTCCT CTGTCCCCCG TCCTCTCAGC AGGCAGGGAA 
12451 CTTCTGAGGG ACAGGTCTTC GTTTGCTTTT TCTGTTTTCT CACCAATTAC 
12501 ATAGGGCTGA GACCCAGGAC TCAGGCTTGG GCTGGGGGTT TATAGAGTCA 
12551 ATTGACAAGT TGGACAGAGG TCTGGCAGGG CCAGCCCCAC CTGGGGGTGG 
12601 GCAAAGCAGG TCACCAGAGC CTTCTTTCCT GCCCACAGGA CAGCCACTCA 
12651 CTGGTGGGAG CAACCGGGTG TGGAGGCCCG CACCCTGCTC AAGCTCTATG 
12701 GAATCCGCTT CGACATCCTC GTCACCGGGC AGGTAGGCAC AGGTAGGGGT 
12751 CAGGCCGGGG ATGGGATGGG GCAGGCAGAC AGGGCTGGAG GAGGCATGAG 
12801 GCTGACAGTC GTGGGCTGAG AGGTTCAGCT CAGATCTCTC TCAGGCAGGG 
12851 AAGTTCGGGC TCATCCCCAC GGCCGTCACA CTGGGCACCG GGGCAGCTTG 
12901 GCTGGGCGTG GTGAGTGCGA GCACTGTGGG CACCTGCAGG CTGCAGTGAG 
12951 TGCTGCTGAC CAGGGTGTGT CCAATGCATG CTGGAGCCTC CGGTGCCTGC 
13001 ACATTGAGTC TCGGGGTGCA GGCTGGGGAG GTGGCAGGAG AGCAGGCTCG 
13051 GGGGCTGGAA CATGGGTTGG CCCTGCCTCT CCCAGGTCAC CTTTTTCTGT 
13101 GACCTGCTAC TGCTGTATGT GGATAGAGAA GCCCATTTCT ACTGGAGGAC 
13151 AAAGTATGAG GAGGTGAGCT GAGGTCGCTC TGCTTGGACC CTGGGTTCTG 
13201 CCACACTTAG GAAGATGTTG GCTGGATCCC TGACCTGCTG TCCTCATCTG 
13251 CAGGCCAAGG CCCCGAAAGC AACCGCCAAC TCTGTGTGGA GGGAGCTGGC 
13301 CCTTGCATCC CAAGCCCGAC TGGCCGAGTG CCTCAGACGG AGCTCAGCAC 
13351 CTGCACCCAC GGCCACTGCT GCTGGGAGTC AGACACAGAC ACCAGGATGG 
13401 CCCTGTCCAA GTTCTGACAC CCACTTGCCA ACCCATTCCG GGAGCCTGTA 
13451 GCCGTTCCCT GCTGGTTGAG AGTTGGGGGC TGGGAAGGGC GGGGCCCTGC 
13501 CTGGGGATCT CAAGGATGAG GCCCCAGCAT GGAGGATTGG GGGTAGAATT 
13551 CCACCCTTGA ACCCCAGCAG ACAGTCCCTC CCCTGACTCC CACCTTGGTA 
13601 GGGTGCTGCC TCAGGGAGCC ATAGAAGTCG GCTGTGTTTT GAGACGGCGA 
13651 CAGAACCTGA CCCGTGGAGA CTGGGAGAGC CCAGCAGGCA CCTGTATTGC 
13701 AGGGCTCCGA CTGCATGTGG CAGGGGCTCC TGCTGCGTCT GGGCCTGGAG 
13751 GTCTCTCTCC CAGTGCTCTG TCCCCAGTGT TCCTAGCAGA GGTATGCTTA 
13801 CCAGCTGTCA GCACAGACCC TCCTGCTGCC TGGGTCCTGG CCCTCCTCCC 
13851 CCATCTGCAC CCCCATCATA GGTAGAGACC CCACCCTCCC ATCGGTCCTA 
13901 CATGGGGCTG TGCAGCTGGA GCCAAAAAGG CAAGGTAGAA AGAGGAGTGA 
13951 TGGGGGAGGG GGATTGTTTC AGCTTCTCTG GTGCTGTGAT GCCCCAGGAG 
14001 AGTCCTAATC TAGGGAATGG GGTGGAGTAG GCAGATAATC CACCTCCCTA
14051 TCCCCCAGGC AAGGGCGGAG CATGTGTCTT GGGCCCACAC CTGCTTAGTT 
14101 TATGAGGACC GGCTGCTTTC CAGTGGTAGC CCTTTTGCCA TGGAGGTCTG 
14151 GGAGAGAGAG CAGAGGGCGG CAGGGCTAAG TTGGTGATCA TTGGGTTCTT 
14201 CAGGACCTTC TATATCCCTC CTCGGTAACC CCCCAGCCCA ACCCCTTGGA
14251 ATCTTTCCTC CAGGCTTCCT GAGAGCCCTG GGGGTGGGAG GCTGTGGGAG 
14301 GCTGTACATC TGAAATTCAC TTCAGTCCAA GTCATACCTA GGAAGCTGTC
14351 TGGGCAGCTG CTCGAGGGAG GCCCTGGCTC TGATCCCAGG CTGGATGGAG
14401 TGGCTGGAAG GAATGGTTCC AAACAACACC ACCGAGATCT CCCTCAGGCT
14451 GGCCAGGTTT TGCAGCTGGA ATTCTCCTCT TGGTCCCAGG GCGGGGCAGG 
14501 GAATTCTAAG TGTCCACCCC AGGGAGGCAA GGGGCTGCTT TCCACTGTGG 
14551 GTACCTGGTG ATCAGGGCAA GCTGTGGAGG GCCAGGGGTG GGGCTGAGAC 
14601 TGGGCTGACA TCTAGAATCA CCTGCCACCT GGAGCCTCAG TAAAATGCCT 
14651 GGGGTCCCTG CTGCCTCTCA ATCTCCAGAG CCATGTCCAT GGGGAGGTGG 
14701 GCTCTGAAGG GCGAAGGTGG GAGAGCAGGG CCCCTGAGGC CTGGGTATCC 
14751 AAGGAGGGGC ACGTGCACCT GATTCTCCTT GGGGCCCAGA GGAAGCTGAT 
14801 GTCATGGCTG GACAAAGTCA CGGAGTAAAG CCAGCAAAGC CACCCTCTTC
14851 CTGTGTAGTC CTTACAGGCA TGACTGGAAA GTTGGGGGGC ATCTATGGTA 
14901 GACATGGCAC AGCCATGAAG AGACCAGTGG GGTGGTGCAG GGTGGACTTG 
14951 GGGACCCTAC CCCTGAAGAC TGAGGCCCTG CAGCTACCAG GTGGGCTAGA 
15001 AGGTAACTGG AACAGGCCTG GGCACTTGTG CACCCATGTA GGAGCATGAG 
15051 GGCCACACTC TTTTCACCTC AAAGCCCTTG AAGAGTGGGC AAAGACAGCA 
15101 AGAGAGCTGC AGCCTGGGCC CGAGCTCAGA AACAGCTGTC GCCTCAGTCT 
15151 GCGCACAGGC ATGCACCCCA GGGTAGTGCC TGCAGGGATG CATGTGTCCC 
15201 CGTGGGGGTG CCTGTGCCAG GCAGGCCTCA GGTGCATGCC ATGCTCAGAA 
15251 CCCTGCTGCC CTTTCTAGGC AGCCTCCTTG GGGCCCAAGC TCTGCTCCCT 
15301 GGATCTGCCA CCTAGCAGAC GTGGGGAGCC TGACCCCATG CCTGTCATGG 
15351 AACCCTCCTT GCCTGGTGTG TGTGGCTCCC CTCTTCACTG GGCACCTGGA 
15401 TCCAGGCCCA CCTGTGTCCC TGACTCAGGG TGGTCCCAGG ACTGGCACCT
15451 ACTCTTTAGA GAGCCCCAGC ATCTTTGATG TGGATTGGAG ACAATTGCCT 
15501 GGTTCCCTGG GGCAGGTGAA GACTTGGTGC CACAAAGAAT GCCACAGTGG 
15551 ATACGCCAGC AGGCCACATG GCTGGCCAAG CAATTATTAT TATGGATCCC 
15601 TTGGGCTGTG GGCCTTCCCA TCCACCCCAC CACAACTGCC CAGGTAGCTG 
15651 GAGCTGATCA TAAACAAGAA GGCTCTGGGC AGAGTCCATG GCACCAGCAC 
15701 CAGCCAAGGC CCACTCCTGA AGACCCGAAG CCCAGCCCCT GGATGAAGGT 
15751 CCTAAGGTCC TGAGGACTCC CCAGCCTGTG CAGGCCTGCA AACCCAGGCT 
15801 GCCCACAACA GAAGGGGCTC TCGGCTTGTC TGGCCTCTCT GGCCTCCCAA
15851 GCAGGTGTGG GAGGGCGGGG CAAGTGTGGG CTGATCAGCT ACTCCATATG 
15901 GCCAGGGTCC TGTGCTGGTG CCTGGCTGGG GGGCTGCATA GCCTGCACTG 
15951 TCTCCTCCAG GCTGCCCCTG GGGAATACCA CGTAGTGTGT GGAGTTCAGC 
16001 CCTGGCAGCT CCCGCTGGTT CTCCTTGCTA TGCCGGATGC CATAGCCGAA 
16051 ATACACTGCA AGTCCTAGAC AGGGCAGGAG GCAGGGCATG AGCCTGAGGT 
16101 ACAGGTTCCA GCCCTTCCTG TCCTCTTTGC CCTCCTCCTG ACCCCGGTCC 
16151 CAGCCTGGCC CCCACTCACC CATCAGCAGC CAGATGGAGA AGCGCACCCA 
16201 GGTCAGATAG CTAAGTTTCA GCATGAGGCA GATGTTGAGG ACGATGCTCA 
16251 GGGCTGGAAT CAGGGGAACC ATGGGGATCT GAGGAGGCAG AGGCAGGGCA 
16301 GGGCTGGGCC GGGCTGCAGG AAAGATCTGC CAGCCCAGGG CTCACTTTCT 
16351 CGGGAATCCA TAGAGCCTTT GTTCCTCACG GGAGATTGTG GAGACATGTG 
16401 CTCACTCACC ATGCAGAAAG GGGTGCGGGA TGGGTGTGTG GTCCTCCCC 
(SEQ ID NO: 3) 
Exon: 13254-13448
Stop: 13449 



SNPS: 



DNA
Position   Major         Minor         Domain
136        C             T             Beyond ORF(5')
253        T             C             Beyond ORF(5')
573        C             T             [illegible]
2000       A             G             [illegible]
2222       G             [illegible]   Intron
2783       [illegible]   T             [illegible]
3199       G             A             Intron
3307       C             G             Intron
5012       C             G             Intron
6169       G             C             Intron
7647       A             G             Intron
8638       C             T             Intron
9409       T             G             Intron
10504      A             C             Intron
10971      T             [illegible]   Intron
12609      G             A             Intron
13367      T             A             Exon
14191      T             C             Beyond ORF(3')
14227      A             G             Beyond ORF(3')
15027      T             C             Beyond ORF(3')
15441      A             C             Beyond ORF(3')

Protein
Position   Major         Minor
21         [illegible]   [illegible]
378        [illegible]   [illegible]

Context:
DNA
Position

136 TC TCCAAGTCC ATGGGTGCCTGGTAGGAGACAGGGGG ATGAATGTGAAC CCCTGC ATGG C 

T ATAGCC ACCTGCCTCC TC C CCTGCCC TGCATCACTACCTGGCCTATTTTTTG CCTCTAG 
AAGCACTGCTTCCTA 
[C,T] 

GCTCCTTAGGACCACTGCCCGCATATGACAGATAAGAACATCGAGGCTAAGGCAACGCAA
ATCTTTTCCTTAAAGTCATACAGCTGTCAAAAGAAAGCTGGACAACCTGGGCAACATAGC
GAGATAAAAAATTATTTAAATTAGCCAGATGTGGTAGCCCCCTGTAGTCTCAGCGACTCA 
GGAGGCTGAGGCAGGAGGCTCACCAGAGTGCAGAGTTCAAGGATGCAGTGAGCTATGATC 
CTGCCACTGCACTGAAAGCTGGGTGACAGAGCAAGACCCTGGCTCTAATAAATGAATACA

253 TCTCCAAGTCCATGGGTGCCTGGTAGGAGACAGGGGGATGAATGTGAACCCCTGCATGGC 
TATAGCCACCTGCCTCCTCCCCTGCCCTGCATCACTACCTGGCCTATTTTTTGCCTCTAG 
AAGCACTGCTTCCTATGCTCCTTAGGACCACTGCCCGCATATGACAGATAAGAACATCGA
GGCTAAGGCAACGCAAATCTTTTCCTTAAAGTCATACAGCTGTCAAAAGAAAGCTGGACA
ACCTGGGCAACA
[T,C]

AGCGAGATAAAAAATTATTTAAATTAGCCAGATGTGGTAGCCCCCTGTAGTCTCAGCGAC 
TCAGGAGGCTGAGGCAGGAGGCTCACCAGAGTGCAGAGTTCAAGGATGCAGTGAGCTATG
ATCCTGCCACTGCACTGAAAGCTGGGTGACAGAGCAAGACCCTGGCTCTAATAAATGAAT 
ACATAAAGTCTCACAGCTAGTGGTAGCTAATCCTGCCAGAGTCAGGCCTCTACCTGTCTG 
ATGACAAATGGCACACTATGTCTTTTAACCTGATTGCAGACCACAAATGTTTTGTGAATA 

573 TAAATTAGCCAGATGTGGTAGCCCCCTGTAGTCTCAGCGACTCAGGAGGCTGAGGCAGGA 
GGCTCACCAGAGTGCAGAGTTCAAGGATGCAGTGAGCTATGATCCTGCCACTGCACTGAA 
AGCTGGGTGACAGAGCAAGACCCTGGCTCTAATAAATGAATACATAAAGTCTCACAGCTA 
GTGGTAGCTAATCCTGCCAGAGTCAGGCCTCTACCTGTCTGATGACAAATGGCACACTAT 
GTCTTTTAACCTGATTGCAGACCACAAATGTTTTGTGAATATTTTCCCCAGGGAAAAAAC 
[C,T] 

GGAAGTAGTTCTAAATTCTATACATCCATTATATTAGTTTTACCTGTGGATTGGGAAAAC
CCAGCTCTGATTGCATTTCAGGGCGGGACAGCCTTTGGTGCACTGTCTGGCGGGATTTTC 
CATTTTAACCTCCTTCTAGAAGCGCCTTCTCATGGTAAAGTTCCTGATGCCGCCAGGAGC 
GCCGAGGAGAGGGCAGGGGGCTGGAGACGCCCCGCAGAGGGCTACGTGCCCTGCTGGACA 
GAGGTCTCCTGCCTCCTCGGCGGCGCCAGCCCACCTCCCACAACCCCTGCGGGAGAAGCC 



FIGURE 3, page 6 of 10 



2000 CTCCTCTCACAGCACTGATAACAGCTGTCCGTCTCCACCCTCCCACCACCTCCACTCCCA 
CCCCAGGAAGTGAGGCCAGAGGGCAGGGACAGAGCTGCTGCTGTTCTCTGTGTGCCAGGG 
CCCAGCAAAGGGAATGTAGGGAGGGTGGGAGGTGCAGGGCAGCTGGGATTAGGGGTTGAG 
GGCTGGGTGTTGGAGGCTGGATCTGGATCCTGCTTTAGTGGAAGTGTCCCTTTAACAGCA 
ACTGGCCTGGCCTGGCTCGGGCCCTGCTTTGCCTCCTGTTCAGCTGCGGCTGCAGCTGCC 

[A,G] 

TGCTGACTCATGTGCCCGCAGCTAGCAGGAGCTGGCAGCATGGGCTCCCCAGGGGCTACG 
ACAGGCTGGGGGCTTCTGGATTATAAGACGGAGAAGTATGTGATGACCAGGAACTGGCGG 
GTGGGCGCCCTGCAGAGGCTGCTGCAGTTTGGGATCGTGGTCTATGTGGTAGGGTAAGAG
AGAAGAGCTTTTGGCCAGGCTGGAGGGGCAAGGGAAGAGGTGGGGGGTGGGGCTTGGTCC
TGCTGGGTTGAAGTTGAGGGTTGGGCTGTTTAGGGGCTGGAGTGGAAGGGGGCAGATTGG 

2222 AGTGTCCCTTTAACAGCAACTGGCCTGGCCTGGCTCGGGCCCTGCTTTGCCTCCTGTTCA 
GCTGCGGCTGCAGCTGCCATGCTGACTCATGTGCCCGCAGCTAGCAGGAGCTGGCAGCAT 
GGGCTCCCCAGGGGCTACGACAGGCTGGGGGCTTCTGGATTATAAGACGGAGAAGTATGT 
GATGACCAGGAACTGGCGGGTGGGCGCCCTGCAGAGGCTGCTGCAGTTTGGGATCGTGGT 
CTATGTGGTAGGGTAAGAGAGAAGAGCTTTTGGCCAGGCTGGAGGGGCAAGGGAAGAGGT 

[G,C] 

GGGGGTGGGGCTTGGTCCTGCTGGGTTGAAGTTGAGGGTTGGGCTGTTTAGGGGCTGGAG 
TGGAAGGGGGCAGATTGGGACGGGGTTGGGGAGAGCTAGGCGATACAAGACAGGAGAGCA 
AGAACAAGCTGTGTGTTTGTCCTGTGTGTCCACTTGCCTCCTTCCCAGGCCCCCACCCAG 
GCCCCACCCAGGGGGCACATGACATAGTCCTTAACATCTGTGAGAGCTGGAGCACTAGGC
CCCCAGAGAGACCACCAGCTGTATCTCGGGTCAGGAGAGTCTGTAAGGGGGAAGCTGGAT

2783 GTATCTCGGGTCAGGAGAGTCTGTAAGGGGGAAGCTGGATCTAGTCAGGCTGGGGGTGGG 
TGCTGGCTAGTGAAGGTGATTGTCTGAGGGCATTGGCTCTCTGATGCATGGCTGGAGCTT
CTGTCTCATTCAGGGGGTCTGGAGTGGGAAGTGGGGCCAGAGAGGAGGTGGGGCCTTCGA 
TGTTGGGCCGGGAGCCTGTAGGGTGTGGGGGGAGAACTGAGCATGTAGGGCTCAGCTCCG
CCCCTGTCACTACACGCTGGGGACACACCACACTGCCCGACTTCTCCTCCCCAGGTGGGC 

[G,T] 

CTCCTCGCCAAAAAAGGCTACCAGGAGCGGGACCTGGAACCCCAGTTTTCCATCATCACC 
AAACTCAAAGGGGTTTCCGTCACTCAGATCAAGGAGCTTGGAAACCGGCTGTGGGATGTG 
GCCGACTTCGTGAAGCCACCTCAGGTGGGGGCCCTGATGTTGCTGACGGGGGCGCAAGTC 
CTTTCCCCACTGACAGCCTGAACACCCGCGATGCAGCCAGTGTGTGCGAGAGAGAAGCAT
GTGATGCCAGAGACGGCTGCGGGTTCTCAGGAAGGGCTTCACAGAGGAGTGGCACCTGGA 

3199 ATGTGGCCGACTTCGTGAAGCCACCTCAGGTGGGGGCCCTGATGTTGCTGACGGGGGCGC 
AAGTCCTTTCCCCACTGACAGCCTGAACACCCGCCATGCAGCCAGTGTGTGCGAGAGAGA
AGCATGTGATGCCAGAGACGGCTGCGGGTTCTCAGGAAGGGCTTCACAGAGGAGTGGCAC
CTGGACAGGACTTTCAGGGATGTGTAGGAGGTTTTGGGGTGGAAAAAGGGGCCACTCAAG 
AAGCCAGGCCAGGGTTGGACGTGCTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCC
[G,A]

AGGCAGGTGGATCACGAGATTGAGAGTATCCTGGCTAACACGGTGAAACCCCATCTCTAT 
TAAAAATACAAAAAATTAGCCGGGCATGGTGGTGGGCGCCTGTAGTCCCAGCTACTCGGG 
AGGCTGGGGCAGGAGAATGGCATGAACCCGGGAGGTGGAGCTTGCAGTGAGCCGAGATTG 
CACCACTGCACTCCAGCCTGGGTGGCAAAGCGAGACTCTGTCTCAAAAAAAAAAAAAAAA 
AGCCAGGCCAGAGAAACTGCATTTCCAAAGACTGCCAACAGAAAAGAAGGGAGTGTCCAG

3307 GTGCGAGAGAGAAGCATGTGATGCCAGAGACGGCTGCGGGTTCTCAGGAAGGGCTTCACA
GAGGAGTGGCACCTGGACAGGACTTTCAGGGATGTGTAGGAGGTTTTGGGGTGGAAAAAG 
GGGCCACTCAAGAAGCCAGGCCAGGGTTGGACGTGCTGGCTCACGCCTGTAATCCCAGCA 
CTTTGGGAGGCCGAGGCAGGTGGATCACGAGATTGAGAGTATCCTGGCTAACACGGTGAA 
ACCCCATCTCTATTAAAAATACAAAAAATTAGCCGGGCATGGTGGTGGGCGCCTGTAGTC 

[C,G] 

CAGCTACTCGGGAGGCTGGGGCAGGAGAATGGCATGAACCCGGGAGGTGGAGCTTGCAGT
GAGCCGAGATTGCACCACTGCACTCCAGCCTGGGTGGCAAAGCGAGACTCTGTCTCAAAA
AAAAAAAAAAAAAGCCAGGCCAGAGAAACTGCATTTCCAAAGACTGCCAACAGAAAAGAA 
GGGAGTGTCCAGGACTAATGGCTTGAGCTTGAGAGTGGTGTGAGGTGCTGGGGCATGGAA
CTTCCCTGTAGCCCTGCTCCCTGACCTGGGGCACTACGGTCAGGTGCTGCTCCTCCCCTC 

5012 TTAATGACTTGATGGGGCCAACATCCCTTCCCTCATAAACCAGGCTGCCGGCTTCCGGCC 
TTTCCAGTCAACACGAGCCCAGCCAGGCCAACCTTGAGACTTGCCTCCTAGGGAGAGAAC 
GTGTTCTTCTTGGTGACCAACTTCCTTGTGACGCCAGCCCAAGTTCAGGGCAGATGCCCA 
GAGGTGAGTTTACCCAGGATCCTCCCAGCGGGTCCCTTGTTCCTCCATCAGCCCCAGGTG 
GCCACCCGTGTTTCCCTTTCCCCTTCCCAGGTGGCTGAAGGCTCAGCCTGTGCTCGGTGT 

[C,G] 

CCCCAGGCACTGGGCTACATCTTTTCCTGAATCATTATGTTCAGTCTTCACATATCCCCT 
GCCTGGTAGGAAGTCCTGTGATCCCCATTTCAGAGGAGAAGACTGAGGCTCAGTGAGGTT
GAGTCACTTTCTTAAGGCCTCCAGGCCTGTGGGTGACAGGACCCCGAGCTCTGGGCAGCA 
GCAGTTCCCATGAGGTGTCCAGGCCCTCCCATCCTGGTCCTGCCTCTGGGTACTCTCCAG 
GTTGGTAGTGTGACACCCAGAGCTGCGCACATGCTCAGGGAGGTTCTAATAGCAAGAGCC

6169 CTTGCCTGGGGATGTCCCTGGGATCCTGCATCTGTCACAGAGCATGCTCATTCTCTCCAG
CTGTGAATTTTGTTTGAACTATTGGGACTCAGGACATAGTCCTGAAAGTTTACCTCCACA
GTGACATCTTTAGGCAAGTCCAACATTTACGTGCCTCCTGGGCTGGAGGGTCGTTGTGCA 
GACAGCTGTCCCCTGAGCCCTGGTGGCTGGTCCTAGCACAGTTGCTGGAGACATCCCATG 
TCCGTAGTTGGAAATATGCACAAAGGATTGCTTACTCTTTTTGTTTGTTTGTTTTTTTGA 

[G,C]

ATGGAGTCTTGCTCTTGTCCCCAAGGCTGGAGTTCAATGGCACGATCTCGGCTCACTGCA 
ACCTCCGCCTCCTGGGTTCAAGCAGTTCTCCTGCTCACCCCCTGAGTAGCTGGGATTACA 
GGTGCCCGCCACTGTGCCCAGCTAATTTTTGTATTTTAAGTAGAGACGGGGTTTCACCAT 
GTTGGCCAGGCTGGTCTCGAACTCCTGGCCTCAGGTGACCCACCAGCCTCGGCCTCTCAA 
AGTGCTGGGATTACAGGCGTGAGCCTGCCGAGAGCTTGGTCGGGGAGACCTGAACCCAGC

7647 AGGTGGCAGGTCCGATGATGGGACAGAGGCTGTAGGTGGGGGACCTAGGGCTGCACTTGA 
GCAGAATCTTTTTTTTTTTTTTCTTTTTTTTTTTTTTGAGACAGAGTCTCGCTCTGTCAC 
CCAGGCTGGAGTGCAGTGGCGTGATCTCGGCTCACTGCACACCTCCACCTCCTTGGTTCA 
AGCGATTCTCCTGCCTCAGCCTCCCAAGTAGGTGGGACTACAGGCACACACCACCACACT
CGGCTAATTTTTGTATTTTTAATAGAGACAGGGTTTTGCTGTGTCGGCCAGGCTGGTCTC 

[A,G] 

AACTCCTGACCTCAGGTAATCCGCCCACCTTGGCTTCTCAAAGTGTTGGGATTACAGGTG 
TGCCAGGCCAAGCAGAATCTTAAAAAAAGGTGGGGAGAAGCTGGTGAGCAGGTGGATTTG 
GTTGAAGCAGGATGTCGACACAGAGGGGGCTTGGTGGGTAAAGGCCCTGAGCTGTGTGAG 
GTGAGGTGCCTTTAGGGCTACCTGCCACTGGGTGGAGCTGAAGTGAAGATTTGGACTGGG 
GTGGGAAGAAGGTAGTTCAGGATTTCAGGGGCCCCTGTAAGCCCCACTAAGGAGCTAAAC

8638 ACAAAGAAGCAGAGCATGTGGCTCTGCTCCGACCTCCACCCAATCACGACGGCCCTGTCT 
TTCAGAAAGTCCCACCGCCTCATTCTGGCTTCTCAGAGGCCCTCAGCCTTCCTTGCGCCC 
CTGGTGCTGGTGTTCTTCCTGCTGCCCCTGAGCTGAGTGCCCTGGGCAGCAGTGTCCATC 
CTCAGTTGGGGCAGGACCATGCCTGGGAGAGTGCCCGATGCTCAAGGGTGCCTTCGTCTC 
TGGGGTCTGGGACCCCAGAAAGCTCACCTGTCCTCCCCTTCTGCCAGAGCCCCATAGTCC
[C,T]

ATGCCTCTGTGCAGGCATTAATGTCCCCAGGTTACAGAAGAGCGAGCAGGAAGGAGTAGC
CTGTGGTCCCTCAGCAAGGGTGTGGGGTCCTGCTTCAATACCCAAGCCCCTGACTCTAGG 
GCCCTGATCTTTGTCAGCTATGTCCCCATGCCGGGCATCAAAAACTCACCCTCCCAAGGT 
ATCTTCACCTTCCCTGATCTGTCATCCAAATTGGACCAGAGGAGCTAGACCTGGAAGAAT
CACTTCCGCATCCACCAGGGACAGAACTGTCAGGAGGGAAGGGGCAGGGTGCGTTGTCTC

9409 TGAGGTGAGAGGATTGCTTAAGCCCGGGAGGGCGAGGCTGTAGTGAGCCATGATCATACC 
ACTGCACTAGAGCCTGGACAACAGAGTGAGACCGAATCACTAAAAATAAATTTTTTGAAA 
AAGGAGGAAAGGGGTCTCCCTTTGTCTTTGAAATACAGTACTGTACCTTCATCTGGCCAG
GGCATTGCTCCGCTCCCTCCTCTGACCACCTCCTTTTATTTGCACCCTCCAGCTTTCCTG 
TGTGGCCCCACACTCAGGGTACTCTGGCGGCGGGGTGGTGAGGTTGTTTAAGGTGGGAAG 

[T,G]

GGGCCTGTCCTTCCCACCTTGAACCTCCCTGCCTTTGAGACTGGGCTGTGGAGGGGAGAC 
ATCCCCTGTGCCATTGGTGACTGCTCTCTCTCCCACCTCAGCACCCGTCCGTCCCACTGG 
CTAACTGCTGGGTCGACGAGGACTGCCCCGAAGGGGAGGGAGGCACACACAGCCACGGTA
ACTGTGGGCTCTGTCTTCCAGTGCCCCTAGCAGGGTGGGGGCCGGGCTGGGATCCTGGGT 
GGCTCCTGAGTGCAGGCCCTGCTCGCCTCTGTCCCTGCATCTCTCTTTCTGCCAACAACC 

10504 GACCTCGTGGCCAAGGCTGGAGGGACCTTCGAGGACCTGGCGTTGCTGGTGGGTCCCAAG 
TTGGGGGCAGGGTTCCTAGAGGGCTCTGGGAGAGGGTCCCGGGCCCACCCACCGGTGGAA 
AAGCTATGTGCTATGTGCAGGGTGGCTCTGTAGGCATCAGAGTTCACTGGGATTGTGACC
TGGACACCGGGGACTCTGGCTGCTGGCCTCACTACTCCTTCCAGCTGCAGGAGAAGAGCT 
ACAACTTCAGGTGAGGCCCCACTGCTCCCAGTGCCCAGCTGCTGGGCCCATCGCCCTCTC 

[A,C] 

CTGTGGCGGCCAGGACAGACCACACCCAGGCCCAGGCCTCTAGATATTCCACTACGTGTG
CAAGGGGGTCCCAGGAGCAGGAGAGAGCTGTTCTCAACCCCACATCCTCCAGCACAGGCT 
CCGTCCTGCTGCCCCAAGTCCTGAGCCCTCCACCCCATCTGTCCCAGGCCCCTGCCCAGC 
TCAGGCTCCTCACTGCCAGCCCTTCCTCCACCCCACCTCGCTTCTAGTATCTCCCCTCCA 
CAGCAATGGGGTGTTTCATTTTTACTTTCCCCTTCTCCCCTTCAGCTTTGTTTTTTTTTT 

10971 GGCCCCTGCCCAGCTCAGGCTCCTCACTGCCAGCCCTTCCTCCACCCCACCTCGCTTCTA 
GTATCTCCCCTCCACAGCAATGGGGTGTTTCATTTTTACTTTCCCCTTCTCCCCTTCAGC 



CCGACCTCGGCTCACTGTAACCTCTGCTTCCTGGGTTCAACCGATTCTCCTTCCTCAGCC
TCCTGAGTAGCTGGAATTACAGGTGCTCGCCACTACTCCCAGCTAATTTTTATATTTTGG

[T,-] 

AGATAGAGATGGGTTTTCACAATGTTGGCCAGGCTGGTCTCAAACCCCTGACCTCAGGTG
ATCCACCCACCTCAGCCTCCCGAAGGGCTAGGATTACAGACGTAAACCACCATGTCTGGC
CTCCCTTCCGCTTTTACCTAAACTTTTTTTTTTTTTTTGAGATGGAGTCTCACTCTGTCG 
CCCAGGCTGGAGTACAGTGGCGGGATCTCAGCTCACTGCAAGTTCCGCTTCCCGTGTTCA
CGCCATTCTCCTGCCTCAGCCTCCCAAGTAGCTGGGACTACGGGTGCACGCCTCCACGCC



12609 CCAGGAAGTCTACCCCAGTTCCCAGGGAAGAGTGAGTTCCCATCTCTGGAATCCCTCAGC 
CCTGAGCCTGCCCCTTCACATCCCCCGCTGCTGGGTCTGTTTAGGGACTCCTCTGTCCCC 
CGTCCTCTCAGCAGGCAGGGAACTTCTGAGGGACAGGTCTTCGTTTGCTTTTTCTGTTTT 
CTCACCAATTACATAGGGCTGAGACCCAGGACTCAGGCTTGGGCTGGGGGTTTATAGAGT 
CAATTGACAAGTTGGACAGAGGTCTGGCAGGGCCAGCCCCACCTGGGGGTGGGCAAAGCA 
[G,A] 

GTCACCAGAGCCTTCTTTCCTGCCCACAGGACAGCCACTCACTGGTGGGAGCAACCGGGT 
GTGGAGGCCCGCACCCTGCTCAAGCTCTATGGAATCCGCTTCGACATCCTCGTCACCGGG 
CAGGTAGGCACAGGTAGGGGTCAGGCCGGGGATGGGATGGGGCAGGCAGACAGGGCTGGA 
GGAGGCATGAGGCTGACAGTCGTGGGCTGAGAGGTTCAGCTCAGATCTCTCTCAGGCAGG 
GAAGTTCGGGCTCATCCCCACGGCCGTCACACTGGGCACCGGGGCAGCTTGGCTGGGCGT 

13367 TTGGCCCTGCCTCTCCCAGGTCACCTTTTTCTGTGACCTGCTACTGCTGTATGTGGATAG 
AGAAGCCCATTTCTACTGGAGGACAAAGTATGAGGAGGTGAGCTGAGGTCGCTCTGCTTG 
GACCCTGGGTTCTGCCACACTTAGGAAGATGTTGGCTGGATCCCTGACCTGCTGTCCTCA 
TCTGCAGGCCAAGGCCCCGAAAGCAACCGCCAACTCTGTGTGGAGGGAGCTGGCCCTTGC 
ATCCCAAGCCCGACTGGCCGAGTGCCTCAGACGGAGCTCAGCACCTGCACCCACGGCCAC 
[T,A]

GCTGCTGGGAGTCAGACACAGACACCAGGATGGCCCTGTCCAAGTTCTGACACCCACTTG 
CCAACCCATTCCGGGAGCCTGTAGCCGTTCCCTGCTGGTTGAGAGTTGGGGGCTGGGAAG 
GGCGGGGCCCTGCCTGGGGATCTCAAGGATGAGGCCCCAGCATGGAGGATTGGGGGTAGA
ATTCCACCCTTGAACCCCAGCAGACAGTCCCTCCCCTGACTCCCACCTTGGTAGGGTGCT 
GCCTCAGGGAGCCATAGAAGTCGGCTGTGTTTTGAGACGGCGACAGAACCTGACCCGTGG 

14191 ATCGGTCCTACATGGGGCTGTGCAGCTGGAGCCAAAAAGGCAAGGTAGAAAGAGGAGTGA 
TGGGGGAGGGGGATTGTTTCAGCTTCTCTGGTGCTGTGATGCCCCAGGAGAGTCCTAATC 
TAGGGAATGGGGTGGAGTAGGCAGATAATCCACCTCCCTATCCCCCAGGCAAGGGCGGAG
CATGTGTCTTGGGCCCACACCTGCTTAGTTTATGAGGACCGGCTGCTTTCCAGTGGTAGC 
CCTTTTGCCATGGAGGTCTGGGAGAGAGAGCAGAGGGCGGCAGGGCTAAGTTGGTGATCA 
[T,C] 

TGGGTTCTTCAGGACCTTCTATATCCCTCCTCGGTAACCCCCCAGCCCAACCCCTTGGAA 
TCTTTCCTCCAGGCTTCCTGAGAGCCCTGGGGGTGGGAGGCTGTGGGAGGCTGTACATCT 
GAAATTCACTTCAGTCCAAGTCATACCTAGGAAGCTGTCTGGGCAGCTGCTCGAGGGAGG 
CCCTGGCTCTGATCCCAGGCTGGATGGAGTGGCTGGAAGGAATGGTTCCAAACAACACCA 
CCGAGATCTCCCTCAGGCTGGCCAGGTTTTGCAGCTGGAATTCTCCTCTTGGTCCCAGGG 

14227 AAGGCAAGGTAGAAAGAGGAGTGATGGGGGAGGGGGATTGTTTCAGCTTCTCTGGTGCTG 
TGATGCCCCAGGAGAGTCCTAATCTAGGGAATGGGGTGGAGTAGGCAGATAATCCACCTC 
CCTATCCCCCAGGCAAGGGCGGAGCATGTGTCTTGGGCCCACACCTGCTTAGTTTATGAG
GACCGGCTGCTTTCCAGTGGTAGCCCTTTTGCCATGGAGGTCTGGGAGAGAGAGCAGAGG 
GCGGCAGGGCTAAGTTGGTGATCATTGGGTTCTTCAGGACCTTCTATATCCCTCCTCGGT 
[A,G]

ACCCCCCAGCCCAACCCCTTGGAATCTTTCCTCCAGGCTTCCTGAGAGCCCTGGGGGTGG 
GAGGCTGTGGGAGGCTGTACATCTGAAATTCACTTCAGTCCAAGTCATACCTAGGAAGCT 
GTCTGGGCAGCTGCTCGAGGGAGGCCCTGGCTCTGATCCCAGGCTGGATGGAGTGGCTGG 
AAGGAATGGTTCCAAACAACACCACCGAGATCTCCCTCAGGCTGGCCAGGTTTTGCAGCT 
GGAATTCTCCTCTTGGTCCCAGGGCGGGGCAGGGAATTCTAAGTGTCCACCCCAGGGAGG 

15027 AGGGCCCCTGAGGCCTGGGTATCCAAGGAGGGGCACGTGCACCTGATTCTCCTTGGGGCC 
CAGAGGAAGCTGATGTCATGGCTGGACAAAGTCACGGAGTAAAGCCAGCAAAGCCACCCT
CTTCCTGTGTAGTCCTTACAGGCATGACTGGAAAGTTGGGGGGCATCTATGGTAGACATG
GCACAGCCATGAAGAGACCAGTGGGGTGGTGCAGGGTGGACTTGGGGACCCTACCCCTGA
AGACTGAGGCCCTGCAGCTACCAGGTGGGCTAGAAGGTAACTGGAACAGGCCTGGGCACT
[T,C] 

GTGCACCCATGTAGGAGCATGAGGGCCACACTCTTTTCACCTCAAAGCCCTTGAAGAGTG
GGCAAAGACAGCAAGAGAGCTGCAGCCTGGGCCCGAGCTCAGAAACAGCTGTCGCCTCAG 
TCTGCGCACAGGCATGCACCCCAGGGTAGTGCCTGCAGGGATGCATGTGTCCCCGTGGGG 
GTGCCTGTGCCAGGCAGGCCTCAGGTGCATGCCATGCTCAGAACCCTGCTGCCCTTTCTA 
GGCAGCCTCCTTGGGGCCCAAGCTCTGCTCCCTGGATCTGCCACCTAGCAGACGTGGGGA

15441 GCCTCAGTCTGCGCACAGGCATGCACCCCAGGGTAGTGCCTGCAGGGATGCATGTGTCCC
CGTGGGGGTGCCTGTGCCAGGCAGGCCTCAGGTGCATGCCATGCTCAGAACCCTGCTGCC 
CTTTCTAGGCAGCCTCCTTGGGGCCCAAGCTCTGCTCCCTGGATCTGCCACCTAGCAGAC 
GTGGGGAGCCTGACCCCATGCCTGTCATGGAACCCTCCTTGCCTGGTGTGTGTGGCTCCC 
CTCTTCACTGGGCACCTGGATCCAGGCCCACCTGTGTCCCTGACTCAGGGTGGTCCCAGG 

[A,C] 

CTGGCACCTACTCTTTAGAGAGCCCCAGCATCTTTGATGTGGATTGGAGACAATTGCCTG 
GTTCCCTGGGGCAGGTGAAGACTTGGTGCCACAAAGAATGCCACAGTGGATACGCCAGCA 
GGCCACATGGCTGGCCAAGCAATTATTATTATGGATCCCTTGGGCTGTGGGCCTTCCCAT 
CCACCCCACCACAACTGCCCAGGTAGCTGGAGCTGATCATAAACAAGAAGGCTCTGGGCA
GAGTCCATGGCACCAGCACCAGCCAAGGCCCACTCCTGAAGACCCGAAGCCCAGCCCCTG
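
Each Context entry above follows one layout: the DNA position, the upstream flanking sequence, the alleles in brackets as [major,minor], then the downstream flanking sequence. The following is a minimal Python sketch of how such an entry could be read and sanity-checked. The function name `parse_context_entry` and the returned field names are illustrative assumptions, not part of the source. The sample lines are the upstream flank, allele line, and first downstream line of the position 136 entry.

```python
import re

# Matches an allele line such as "[C,T]" or "[T,-]" ("-" marks a deletion).
ALLELE_RE = re.compile(r"^\[([ACGT-]),([ACGT-])\]$")


def parse_context_entry(lines):
    """Split one Context entry into position, alleles, and flanking sequence.

    Hypothetical helper: assumes the first non-blank line starts with the
    DNA position and that exactly one [major,minor] line separates the flanks.
    """
    position = None
    alleles = None
    upstream, downstream = [], []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        # Tolerate stray spaces inside the brackets, e.g. "[T, A]".
        match = ALLELE_RE.match(line.replace(" ", ""))
        if match:
            alleles = match.groups()
            continue
        tokens = line.split()
        if position is None and tokens[0].isdigit():
            position = int(tokens[0])
            tokens = tokens[1:]
        # Joining the tokens also removes stray spaces inside the sequence.
        sequence = "".join(tokens).upper()
        (upstream if alleles is None else downstream).append(sequence)
    if position is None or alleles is None:
        raise ValueError("entry needs a leading position and an allele line")
    return {
        "position": position,
        "major": alleles[0],
        "minor": alleles[1],
        "upstream": "".join(upstream),
        "downstream": "".join(downstream),
    }


entry_136 = [
    "136 TCTCCAAGTCCATGGGTGCCTGGTAGGAGACAGGGGGATGAATGTGAACCCCTGCATGGC",
    "TATAGCCACCTGCCTCCTCCCCTGCCCTGCATCACTACCTGGCCTATTTTTTGCCTCTAG",
    "AAGCACTGCTTCCTA",
    "[C,T]",
    "",
    "GCTCCTTAGGACCACTGCCCGCATATGACAGATAAGAACATCGAGGCTAAGGCAACGCAA",
]

parsed = parse_context_entry(entry_136)
# prints: 136 C T 135
print(parsed["position"], parsed["major"], parsed["minor"], len(parsed["upstream"]))
```

For this entry the upstream flank holds 135 bases, which is consistent with the bracketed alleles falling at position 136. Most later entries list five 60-base lines (300 bases) on each side, so the same length check only applies to positions up to 301.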

Chromosome map: 

Chromosome No: 22 



FIGURE 3, page 10 of 10 



